IMAGE PROCESSING SYSTEM AND IMAGE PROCESSING METHOD
Patent Abstract:
Image processing system and image processing method for locating recognized characters in an image. An estimation means is configured to estimate a first location of a recognized character which has been obtained by performing recognition of characters of the image. A determination means is configured to determine second locations of a plurality of connected components in the image. A comparison means is configured to compare the first location and the second locations, in order to identify a connected component associated with the recognized character. An association means is configured to associate the recognized character, the identified connected component and the second location of the identified connected component.

Publication number: BE1026159B1
Application number: E20195180
Filing date: 2019-03-22
Publication date: 2020-05-08
Priority application: GB1805039.3, filed 2018-03-28
Inventors: Frederick Collet; Jordi Hautot; Michel Dauw
Applicant: Iris Sa
Patent Description:
[0001] Technical Field

[0002] The present invention relates to an image processing system and an image processing method, and in particular to the location of recognized characters in an image.

State of the Art

[0004] Character recognition is carried out to convert text included in an image into machine-coded text. Images that can be analyzed using character recognition software include a scanned document, a photograph of a document, a photograph of a scene, a video recording, and text that has been overlaid on a document. Image text that can be converted includes typed, handwritten and printed text. Machine-coded text includes any character encoding standard for electronic communications, such as ASCII and Unicode. Alternatively, the machine-coded text includes a plurality of reference glyphs which have been obtained using the encoding standard. In addition, the machine-coded text can include both the encoding standard and the plurality of reference glyphs.

The character recognition software is configured to receive an image as input and to output the machine-coded text. The term character recognition refers to the identification and recognition of individual characters in the image. However, the term character recognition is also used to cover word recognition, where identification and recognition occur one word at a time. Examples of character recognition include optical character recognition, optical word recognition, intelligent character recognition and intelligent word recognition.

Character recognition is adapted according to the writing system used in the document, such as Latin, Cyrillic, Arabic, Hebrew, Indic, Bengali, Devanagari, Tamil, Chinese, Japanese, Korean, emoji, Morse and braille characters. Character recognition is further adapted according to the language of the text included in the image. The writing system and the language of the text can be identified by the user, or they can be identified by the character recognition software from the context of the characters and words which are recognized. In addition, character recognition can be adapted to process documents that contain text in several writing systems or languages.

Character recognition works by associating machine-coded characters with at least one example of a reference glyph that could be detected in an image. The accuracy of character recognition is improved by increasing the number of reference glyphs that represent a machine-coded character. This is particularly useful for improving the accuracy of recognition across a variety of fonts or handwriting styles. There are a number of conventional techniques for comparing a character identified in the image with the reference glyphs, such as matrix matching and feature extraction. Matrix matching involves a comparison of the pixel pattern of the identified character with the pixel pattern of the reference glyphs. Feature extraction breaks down the input character into features such as lines, closed loops, line direction and line intersections, and these extracted features are then compared with the corresponding features of the reference glyphs. Another available technique is intelligent recognition, which is obtained by using machine learning to train a computer system which uses a neural network. Intelligent recognition improves the recognition of characters that do not correspond to the reference glyphs.
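By way of illustration only (this sketch is not part of the disclosed invention), the matrix matching comparison mentioned above can be reduced to a pixel-agreement score against each reference glyph; the 3x3 bitmaps below are hypothetical stand-ins for full-resolution binarized character images:

```python
# Illustrative sketch of matrix matching: the candidate bitmap is compared
# pixel by pixel with each reference glyph, and the glyph with the highest
# proportion of agreeing pixels wins. The 3x3 glyphs are hypothetical.
REFERENCE_GLYPHS = {
    "T": ((1, 1, 1),
          (0, 1, 0),
          (0, 1, 0)),
    "L": ((1, 0, 0),
          (1, 0, 0),
          (1, 1, 1)),
}

def match_score(candidate, reference):
    """Fraction of pixels on which the two binary bitmaps agree."""
    matches = [c == r for c_row, r_row in zip(candidate, reference)
               for c, r in zip(c_row, r_row)]
    return sum(matches) / len(matches)

def matrix_match(candidate):
    """Return the best-matching machine-coded character and its score."""
    return max(((char, match_score(candidate, glyph))
                for char, glyph in REFERENCE_GLYPHS.items()),
               key=lambda pair: pair[1])

noisy_t = ((1, 1, 1),
           (0, 1, 0),
           (0, 1, 1))                 # a 'T' with one stray pixel
print(matrix_match(noisy_t))          # ('T', 0.888...)
```

Feature extraction and intelligent recognition replace this direct pixel comparison with extracted line and loop features or with a trained neural network, respectively.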
Character strings, such as words used in sentences, provide contextual information to the neural network, so character recognition is suited to recognizing words that are difficult to recognize in isolation. In addition, the neural network can be trained so that low-quality images can be recognized accurately. The neural network is trained by inputting representations of the characters to be recognized. The training phase performs a gradient descent technique so that the neural network is optimized by reducing output errors. The machine-coded text output is based on a probability measure derived from a comparison with the text samples that were input during the training phase. The feed-forward processing of the neural network is carried out so that there is convergence towards the probability measure. The neural network is used to adapt the character recognition so that it can perform recognition of characters which were not encountered during the training of the neural network.

The position of machine-coded characters is generally identified because the identified character is made up of known connected components. However, individual characters are not identified during intelligent character recognition with a neural network, because a character or word is identified as a whole. In addition, recognition of individual characters is not a reliable measure if the image is of poor quality, for example if perspective issues are taken into account, which leads to inconsistency in the size and orientation of the text. Therefore, the use of neural networks increases the accuracy of character recognition, but reduces the accuracy of estimating the position of characters in the image.

Consequently, there is a need to improve the estimation of the position of the characters which have been recognized in an image. It would be useful to improve the estimation of the position of the characters independently of the character recognition technique used, although this is particularly useful in situations where the character recognition technique uses a neural network.

Summary of the Invention

The aspects of the present invention are defined by the independent claims. According to a first aspect, an image processing system is provided according to claim 1. According to a second aspect, an image processing method is provided according to claim 16. According to a third aspect, a program is provided according to claim 17. According to a fourth aspect, a computer-readable medium is provided according to claim 18.

Brief Description of the Drawings

[0020] Embodiments will now be described, by way of example only, with reference to the accompanying drawings, in which:

Figure 1 provides a schematic diagram which illustrates an image processing system for identifying a position of at least one character in an image;

Figure 2 provides a flowchart illustrating an image processing method for identifying a position of at least one character in an image;

Figures 3A-C provide flowcharts illustrating how an association is made between recognized characters and connected components;

Figure 4 provides an example which illustrates an image in which the position is identified for a plurality of characters of the image;
Figures 5A-C provide an example which illustrates the character recognition carried out to recognize characters from the image;

Figures 6A-C provide an example which illustrates a plurality of connected components identified in the image;

Figure 7 provides an example which illustrates a plurality of characters which have been positioned as a function of the position of the plurality of connected components; and

Figure 8 provides a table which includes examples showing how the connected components are associated with the recognized characters.

Detailed Description

[0030] Various embodiments, characteristics and aspects of the invention will be described in detail below with reference to the drawings. Each of the embodiments of the present invention described below can be implemented alone or as a combination of a plurality of embodiments or features thereof, where necessary or where the combination of elements or features of individual embodiments in a single embodiment is beneficial.

Figure 1 is a schematic diagram which illustrates an image processing system 100 for identifying a position (location) of at least one character in an image. The image processing system 100 is generally used to identify the position of a plurality of characters which are arranged on one or more lines of text. The identification of the position of the plurality of characters can also be called the location of the characters recognized in the image.

The image processing system 100 comprises an input 111 and an output 112, a character recognition unit 120, a processor 140 and a memory 150. The processor 140 comprises an estimation means 141, a determination means 142, a comparison means 143, an association means 144, a training means 145 and an augmentation means 146.

The character recognition unit 120 uses conventional techniques to recognize the text in an image. The character recognition unit 120 performs recognition using a neural network 130 which has been trained to recognize a plurality of character strings, such as words used in context. The recognition of characters in an image generally results in the recognition of a plurality of characters that make up the text contained in the image. However, the character recognition software can also operate to recognize a single character contained in the image. Therefore, this disclosure explains the specific case in which the position of a single character is identified. In addition, it is intended that this technique can be generalized to allow the identification of the position of a plurality of characters which are contained in the image. Consequently, the recognized character or characters are located in the image, which makes it possible to improve the accuracy of the location (position) of the recognized characters.

The processor 140 executes software which serves as an estimation unit 141, a determination unit 142, a comparison unit 143 and an association unit 144. The estimation unit 141 is configured to estimate a first location of a recognized character which was obtained by performing character recognition of the image. The determination unit 142 is configured to determine second locations of a plurality of connected components in the image. The comparison unit 143 is configured to compare the first location and the second locations, in order to identify a connected component associated with the recognized character.
The association unit 144 is configured to associate the recognized character, the identified connected component and the second location of the identified connected component. Therefore, the position of the connected component of the character is associated with the position of the machine-coded character. The recognized character is thus located in the image, the position of the character being improved based on the character in the image, instead of using an estimate derived from the region of the image that was used to perform character recognition.

The processor 140 executes the software to also serve as a training unit 145, which trains the neural network using the second location associated with the recognized character. It is possible for the user to make corrections to the character recognition. The position of the recognized character provides the context for the identified character image, and therefore the position information is useful for teaching the neural network. As a result, the neural network can learn from corrections made by the user. The precise position of the recognized character is not compulsory for the user to modify the output text and thereby feed the learning. That said, precise knowledge of the position of recognized characters is useful because it allows more precise graphical interfaces and improves the user experience with the software.

The processor 140 executes the software to also serve as an augmentation unit 146, which augments the image to include the recognized character at the second location.

The recognized character is a machine-coded representation of the connected component. For example, the recognized character could be a machine-coded representation of a character provided by an encoding standard for electronic communications, such as ASCII and Unicode. The recognized character can also be a reference glyph which was obtained using the encoding standard. In addition, the recognized character could include both the machine-coded representation and the reference glyph.

The provision of a character recognition unit 120 is not an essential characteristic of the image processing system 100, since the image processing system 100 could be used to determine the position of text for which character recognition has already been performed. For the example illustrated in Figure 1, the image processing system 100 provides at least one character recognition unit 120. The character recognition unit 120 may comprise a plurality of character recognition units 120, comprising a first character recognition unit and a second character recognition unit, and may include other character recognition units. Each character recognition unit 120 performs the function of identifying the characters in a region of an image and associating the identified characters with the machine-coded text. The characters in the image are identified and recognized from the analysis of the pixels in the image region. Characters can be recognized in a selection of languages, in a variety of fonts.

The use of a plurality of different character recognition units 120 makes it possible to adapt the character recognition units 120 in order to optimize the recognition of characters for specific conditions. The quality of the image, the language of the text, the font of the text, whether the text is typed or handwritten, and the available computing resources are examples of specific conditions.
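Purely as an illustration of such condition-based adaptation, a system might dispatch each image to one of several hypothetical recognition units; the unit names and condition keys below are invented for the example:

```python
# Hypothetical dispatch among several character recognition units (120),
# each tuned for different conditions; names and keys are illustrative only.
def pick_recognition_unit(language, handwritten, low_quality):
    if handwritten:
        return "handwriting_unit"        # e.g. a neural-network recognizer
    if low_quality:
        return "robust_low_quality_unit"
    if language in ("zh", "ja", "ko"):
        return "cjk_unit"
    return "printed_text_unit"

print(pick_recognition_unit("fr", handwritten=False, low_quality=True))
# robust_low_quality_unit
```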
The input 111 and the output 112 are configured to receive and transmit electronic data. The input 111 is configured to receive the image to be analyzed, for example from a local network, from the Internet or from an external memory. In addition, the input 111 is configured to receive instructions from a user via, for example, a mouse, a touch screen or a keyboard. The input 111 serves as a selection means configured to allow a user to select recognized characters at the second locations of the corresponding connected components in the image. The output 112 is configured to output the identified text. The output 112 includes a display making it possible to present the identified text to the user. The output 112 includes a network connection for communicating over the Internet.

The image processing system 100 is illustrated by a single image processing device 100 which includes a single character recognition unit 120. As an alternative, the image processing system 100 could include a plurality of character recognition units 120. In addition, the features of the image processing system 100 could be distributed over several different devices, such as a plurality of image processing devices, each having a character recognition unit. The features of the image processing device 100 can be arranged differently. For example, each of the character recognition units 120 may include a processor configured to serve as a determination unit, a comparison unit, an association unit, an augmentation unit and a training unit. The plurality of character recognition units 120 can be part of the same device or be distributed as a system over a plurality of devices. The image processing device 100 can be part of a personal computer. The image processing device 100 may also be part of a multifunction device, further comprising a scanner, a copier, a fax machine and a printer.

Figure 2 is a flowchart illustrating an image processing method S200 for identifying a position of a character in an image 400. A typical image 400 for which character recognition must be carried out is a document which includes lines of text. The image processing method S200 is implemented by the image processing system 100. A program, when executed by the image processing system 100, causes the image processing system to perform the image processing method S200. A computer-readable medium stores the program.

In step S210, line segmentation is performed on the image. Through line segmentation S210, a plurality of lines of text are identified in the image. A variety of techniques are available for performing image segmentation, which is the process of partitioning a digital image into several segments, and a number of techniques are available to perform line segmentation S210. The image is binarized to transform a grayscale or color image into a binary image. A line segment is identified if it has the expected characteristics of a line of text, such as the detection of edges of shapes conforming to a line, or the determination that individual characters or individual words are grouped together.

The line segmentation step S210 can perform line recognition for lines of text which are arranged according to a number of orientations. The orientation in which the text is arranged depends on the presentation of the document in portrait or landscape. In addition, the text can be arranged at an angle, especially if the document contains handwriting.
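As a minimal sketch of step S210, assuming horizontal lines of dark text on a light background, binarization followed by a horizontal ink projection is enough to separate the lines; production systems use adaptive thresholds and tolerate skew:

```python
# Minimal sketch of step S210: global-threshold binarization followed by a
# horizontal ink projection to find line segments. Assumes dark text on a
# light background and roughly horizontal lines; real systems use adaptive
# thresholds and handle skew.
def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 values) to 1 (ink) / 0 (paper)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]

def segment_lines(binary):
    """Return (top_row, bottom_row) pairs for runs of rows containing ink."""
    lines, start = [], None
    for y, row in enumerate(binary):
        if any(row) and start is None:
            start = y                      # a line of text begins
        elif not any(row) and start is not None:
            lines.append((start, y - 1))   # the line ends on the previous row
            start = None
    if start is not None:
        lines.append((start, len(binary) - 1))
    return lines

gray = [[255] * 8,
        [40, 40, 255, 255, 30, 30, 255, 255],
        [255] * 8,
        [255, 50, 50, 255, 255, 255, 45, 255]]
print(segment_lines(binarize(gray)))       # [(1, 1), (3, 3)]
```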
Text lines are generally arranged horizontally, with the text read from left to right. However, in some languages the text is read from right to left, and in some languages the lines are arranged vertically instead of horizontally. Line segmentation can be performed by the character recognition unit 120 as part of the character recognition step S220. Alternatively, line segmentation can be performed by the processor 140. Note that step S210 is not essential, because the line segmentation can be carried out before the image processing method S200, the character recognition being carried out for an image 310-330 which has already been segmented into one line of text. Line segmentation S210 does not need to be performed by the image processing device 100, since line segmentation could be performed by another device. The character recognition unit 120 performs additional segmentation of the image, so that each line of text is segmented into a plurality of regions.

In step S220, character recognition is performed on each of the regions. Each of the regions contains an individual character or an individual word. Each region of the image can be identified by one or more coordinates of this region, such as the center of the region or a corner of the region. The character recognition unit 120 includes a neural network 130 and operates by determining activations which represent a probability value that the output of the character recognition has converged. Activations provide machine-coded character values that have been recognized above a confidence level. An activation is an output from the neural network, and could, for example, specify that there is a 95% probability that a given letter has been recognized.

The identification of the position of the machine-coded characters is particularly useful in the case where the character recognition unit 120 comprises a neural network 130, because intelligent character recognition does not generate a precise position of the characters in the image 400. Consequently, the identification of the positions of the characters is carried out once the recognition of the characters has ended. Note that step S220 is not essential, because the character recognition can be carried out before the image processing method S200, the position being determined for machine-coded characters which have already been recognized from the characters identified in the image 400. Character recognition S220 does not need to be carried out by the image processing device 100, because the character recognition could be carried out by another device.

In step S230, an estimate of the position of each character recognized in the image is made. The estimate provides a coordinate corresponding to the region of the image that was used to perform character recognition by the character recognition unit 120. Thus, the region used to perform character recognition is used to provide an approximation of the position of the character. The image is a two-dimensional array of pixels, whose positions are addressed by a coordinate system: the horizontal position is addressed by an x coordinate, and the vertical position is addressed by a y coordinate. The estimated position can be specified by the x and y coordinates. However, for a particular line of text, it is not necessary to specify the vertical position; it is possible to estimate the position using only the horizontal position.
Therefore, a single x coordinate can be used to estimate the position of the recognized character in a line segment.

In step S240, a coordinate is determined for each connected component in the region of the image identified in step S230. A connected component is a part of the image in which adjacent pixels of the image have the same color, for example groups of touching pixels which are black. Connected components can be identified in the image because the image is binarized so that the text appears as black characters on a white background. A conventional technique is used to binarize the image, for example determining whether a feature has been detected having a grayscale value or a color-scale value which is greater than a threshold. A connected component generally corresponds to a single character, such as the letter T, for which all the pixels of the character are connected together. Some characters consist of a plurality of connected components, such as the letter i, which includes two connected components. It is also possible that a plurality of characters form a single connected component, for example characters linked together by a ligature or an underscore. The correspondence between the characters and the connected components depends on the writing system and the language, as well as on the particular font used in the image.

Consequently, each connected component is associated with a coordinate. The coordinate can correspond to a part of the connected component itself, such as the left-most point of the connected component, the right-most point of the connected component, or the mid-point between the left-most and right-most points. Alternatively, the coordinate can correspond to a part of a bounding box which contains the connected component, such as the center of the bounding box, or a barycenter. It is possible to specify an x coordinate and a y coordinate representing the horizontal position and the vertical position of the connected component in the image. However, for a particular line of text, it is not necessary to specify the vertical position; it is possible to determine the position of the connected component using only the horizontal position. Therefore, a single x coordinate can be used to determine the position of the connected component. The use of a single coordinate simplifies the calculation, because positions are compared along a single dimension.

In step S250, the recognized characters obtained in step S220 are associated with the connected components determined in step S240. Consequently, the estimate of the position of the recognized character is replaced by the position of the connected component. As a result, recognized characters are assigned a position based on the text identified in the image. Following step S250, each recognized character is associated with a connected component, based on a comparison of their coordinates. The recognized character is associated with the nearest connected component, because the closest connected component is the most likely to correspond to the recognized character. The distance between a recognized character and a connected component is defined as the difference between the coordinates associated with the recognized character and the connected component. The closest coordinates correspond to the minimum distance between the recognized character and the connected component. For a particular line of text, it is possible to calculate the distance using only the x coordinate, which simplifies the calculation, because the positions are compared along only one dimension.
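A minimal sketch of this nearest-coordinate association of step S250, using a single x coordinate per recognized character and per connected component; the tuple layout is an assumption of the example:

```python
# Minimal sketch of step S250: each recognized character is associated with
# the connected component whose x coordinate is nearest to the character's
# estimated x coordinate (step S251), and components left without a character
# are attached to their nearest recognized character (step S252).
def associate(recognized, components):
    """recognized: [(char, x)], components: [(component_id, x)]."""
    blobs = {rc: min(components, key=lambda c: abs(c[1] - rc[1]))
             for rc in recognized}
    used = set(blobs.values())
    for comp in components:               # step S252: unassigned components
        if comp not in used:
            nearest = min(recognized, key=lambda rc: abs(rc[1] - comp[1]))
            blobs[nearest] = (blobs[nearest], comp)   # widen that blob
    return blobs

recognized = [("T", 12), ("e", 30)]
components = [("cc1", 10), ("cc2", 28), ("cc3", 55)]
print(associate(recognized, components))
# {('T', 12): ('cc1', 10), ('e', 30): (('cc2', 28), ('cc3', 55))}
```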
If the distance between the recognized character and the connected component is less than a threshold, this indicates that the recognized character has probably been associated with the correct connected component. However, it may be difficult to associate the recognized characters with the connected components. If there are several possibilities, a recognized character is associated with the connected component which is to its left. If there are unassigned connected components, these are associated with their nearest recognized character. Further details on step S250 are provided by the disclosure of Figures 3A-C.

Figure 3A provides a first flow diagram illustrating how an association is made between the recognized characters and the connected components in step S250. In step S251, the comparison unit (comparison means) 143 identifies the connected component having a second location which is closest to the first location of the recognized character. Thus, the recognized character is associated with its nearest connected component. In step S252, the comparison unit identifies any other connected component which is not associated with any recognized character and identifies the recognized character closest to that other connected component. Therefore, if there are one or more unassigned connected components, each will be associated with its nearest recognized character in step S254A.

In step S253A, a blob is created for each recognized character. Each blob includes the recognized character, as well as the connected component. For the situation in which there were one or more unassigned connected components in step S252, a blob includes the recognized character, as well as one or more other connected components.

In step S254A, the association unit (association means) 144 associates the recognized character, the identified connected component and the second location of the identified connected component. For the situation in which, in step S252, an unassigned connected component is associated with its closest recognized character, the association unit 144 associates the closest recognized character, the connected component identified with that closest recognized character, the other connected component and the second location of the identified connected component. The number of connected components that can be associated with a recognized character is not limited to two.

Figure 3B provides a second flow diagram illustrating how an association is made between the recognized characters and the connected components in step S250. Figure 3B presents a technique for managing the situation in which a connected component corresponds to a plurality of recognized characters, which occurs if the text of the image comprises an underline or a ligature. Steps S251-S252 in Figure 3B are the same as those in Figure 3A. Step S252 provides for the possibility that unassigned connected components are associated with the nearest recognized character, even in the case where the connected component corresponds to a plurality of recognized characters. In step S255B, it is evaluated whether the connected component of step S251 is associated with a plurality of recognized characters.
If the connected component is associated with a plurality of recognized characters (YES in step S255B), the method proceeds to perform steps S253B and S254B. If the connected component is not associated with a plurality of recognized characters (NO in step S255B), this corresponds to the situation of Figure 3A, and steps S253A and S254A are therefore carried out.

In step S253B, a merged blob is created. Each merged blob includes the plurality of recognized characters, as well as the connected component. For the situation in which there were one or more unassigned connected components in step S252, a merged blob includes the plurality of recognized characters, as well as one or more other connected components.

In step S254B, the association unit (association means) 144 associates the plurality of recognized characters, the identified connected component and the second location of the identified connected component. For the situation in which, in step S252, an unassigned connected component is associated with its nearest plurality of recognized characters, the association unit 144 associates the closest plurality of recognized characters, the connected component identified with that plurality of recognized characters, the other connected component and the second location of the identified connected component.

Figure 3C provides a third flow diagram illustrating another embodiment in which an association is made between the recognized characters and the connected components in step S250. Figure 3C provides an alternative to Figure 3B for managing the situation in which a connected component corresponds to a plurality of recognized characters. Steps S251-S252 in Figure 3C are the same as those in Figure 3A, and step S255C in Figure 3C is the same as step S255B in Figure 3B. Consequently, it is evaluated whether the connected component of step S251 is associated with a plurality of recognized characters. If the connected component is associated with a plurality of recognized characters (YES in step S255C), the method proceeds to perform steps S256C, S253C and S254C. If the connected component is not associated with a plurality of recognized characters (NO in step S255C), this corresponds to the situation of Figure 3A, and steps S253A and S254A are therefore carried out.

In step S256C, a division unit (division means) divides the connected component into a plurality of parts. The number of parts corresponds to the number of recognized characters, and the connected component is divided so that each part corresponds to one of the plurality of recognized characters. The division can be performed according to the location of the recognized characters, so that the connected component is divided into as many connected components as there are recognized characters. The comparison means is configured to identify a location for each of the parts.

In step S253C, a plurality of divided blobs is created, one divided blob for each recognized character. Each divided blob includes a recognized character, as well as the corresponding part of the connected component. In the case where there were one or more unassigned connected components in step S252, it is possible that a divided blob includes a recognized character, the corresponding part of the connected component, as well as one or more other connected components.

In step S254C, the association unit (association means) 144, for each of the parts, associates the recognized character, the part of the connected component and the location of the part. In the case where, in step S252, an unassigned connected component is associated with its closest recognized character, the association unit 144 can associate the closest recognized character, the identified part of the connected component of that closest recognized character, the other connected component and the location of the identified part of the connected component.
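A minimal sketch of the division performed in step S256C, assuming the component is reduced to its bounding box and split evenly; a real division would follow the estimated character locations:

```python
# Minimal sketch of step S256C: a connected component matched to several
# recognized characters is divided into equal-width parts, one per character,
# using its bounding box. Each part becomes a divided blob (step S253C).
def divide_component(bbox, chars):
    """bbox = (x_left, x_right); returns one sub-box and location per character."""
    x_left, x_right = bbox
    width = (x_right - x_left) / len(chars)
    parts = []
    for i, char in enumerate(chars):
        left = x_left + i * width
        right = left + width
        parts.append((char, (left, right), (left + right) / 2))  # divided blob
    return parts

print(divide_component((100, 140), "+3"))
# [('+', (100.0, 120.0), 110.0), ('3', (120.0, 140.0), 130.0)]
```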
Returning to Figure 2, after step S250 has been carried out, as illustrated by Figures 3A-C, the image processing method S200 proceeds to step S260. In step S260, the memory 150 stores, in association with the image 400, the recognized character and the second location of the identified connected component. Note that step S260 is not essential, because the processor 140 can display the output of the image processing method S200 to the user. Storage in a memory 150 can take place after the completion of the image processing method S200. The storage does not need to be performed by a memory of the image processing device 100, since the output of the image processing method S200 can be stored by a different device.

If it is determined in the image processing method S200 that an error has occurred in the identification of the position of a character, this indicates that the character has been incorrectly associated with a connected component, and therefore that there is an error in the mapping between the image and the machine-coded character. In this case, the image processing method S200 returns to the association step S250, so that the character can be associated with a different connected component. To do this, the image processing system 100 shown in Figure 1 can be further improved by providing an error detection unit. The error detection unit is configured to detect that the recognized character and the connected component do not match. This feature is used to determine that the determination of the position of the recognized character in step S260 is incorrect. Therefore, the error detection unit returns the image processing method to step S250. Error detection can be performed by the error detection unit by comparing the input image with the machine-coded text output, to determine whether the location of the characters identified in the image matches the location of the corresponding recognized characters. As an alternative, error detection can be performed by the user.

The disclosure of the image processing method S200 describes the situation in which each character is associated with a single position. However, as an alternative, each character identified in the image can be associated with a plurality of positions, which is useful for specifying the position of the character in more detail. For step S230, the position of each recognized character is associated with a plurality of coordinates, each coordinate corresponding to a different part of the recognized character. For step S240, the position of the connected component is associated with a plurality of coordinates, each coordinate corresponding to a different part of the connected component. For step S250, the parts of the recognized character are associated with the parts of the connected component. For step S260, the position of the parts of the recognized character is determined using the coordinates of the parts of the connected component. Thus, the accuracy of the shape and orientation of the connected component is improved.
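A minimal sketch of the record stored in step S260, serialized here as JSON; the field names and component identifiers are assumptions of the example, and several coordinates per character are allowed, as in the variant just described:

```python
# Minimal sketch of the stored association of step S260. The layout is
# hypothetical; component identifiers echo the bounding boxes of Figure 6.
import json

record = {
    "image": "image_400",
    "blobs": [
        {"character": "T", "components": ["cc_610"],
         "locations": [[12, 5], [20, 5], [12, 18], [20, 18]]},  # bbox corners
        {"character": "2", "components": ["cc_615", "cc_616"],
         "locations": [[44, 6]]},
    ],
}
print(json.dumps(record, indent=2))
```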
The image processing method S200 can be executed by a number of different devices or by a number of different users executing the individual steps S210-S260. In particular, the line segmentation and character recognition steps S210-S220 could be performed separately from the other steps of the image processing method S200. Consequently, it would be possible to adapt an existing character recognition unit 120 so that it can be used in the context of the disclosed image processing system 100.

Associating the recognized character with the connected component in step S250 makes it possible to determine the position in step S260, even in the case where the quality of the image is low. Examples of low-quality images include:

- images whose recognized characters have different heights, which can occur when a document has been scanned at an angle, for example when a camera takes a photo of the document;

- images whose text line is not uniform, which can happen when a document is not placed flat when scanning; and

- images for which characters cannot be recognized with precision in the image, which can occur if the document contains text which is not uniform.

To do this, the image processing system 100 shown in Figure 1 can be further improved by providing a perspective detection unit. The perspective detection unit is configured to determine whether or not the perspective of the image affects the position of the character. Thus, the image processing system 100 can identify the position of a plurality of characters whose height or width differ. To do this, the height or width of the characters in a line of text can be measured to determine whether there is a change in these measurements when moving along the line of text (a minimal sketch of such a check is given below).

As a result of character recognition, a mapping is provided between the characters in the image and the machine-coded characters output by the character recognition performed on the image. It is useful for the mapping to specify the position of the characters in the image. An advantage of associating the machine-coded characters with the position of the characters in the image is that the machine-coded characters can be arranged to correspond to the original image. This is useful for displaying machine-coded characters to the user so that they match the text in the image. In addition, overlaying the image with the machine-coded characters allows the user to select the text. It is useful to provide machine-coded text because it enables a search function over the machine-coded text. Overlaying the text with recognized characters is also useful for displaying text to the user, for example in the situation in which recognized characters are translated into another language, so that the translated text can be displayed with the original image for comparison.
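Returning to the perspective detection unit described above, a minimal sketch of the height-trend check might look as follows; the tolerance value is an assumption of the example:

```python
# Minimal sketch of the perspective check: measure character heights across a
# line and flag a steady drift, which suggests the document was photographed
# at an angle. The tolerance is illustrative only.
def perspective_suspected(heights, tolerance=0.15):
    """heights: character heights in reading order along one line of text."""
    if len(heights) < 2:
        return False
    drift = (heights[-1] - heights[0]) / max(heights)
    return abs(drift) > tolerance

print(perspective_suspected([30, 29, 27, 24, 22]))  # True: heights shrink
print(perspective_suspected([30, 31, 30, 29, 30]))  # False: roughly uniform
```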
Figures 4-7 illustrate an example of the image processing method S200 performed for an image 400, to identify the position of a plurality of characters in the image 400.

[00103] Figure 4 shows an image 400 which is to be subjected to character recognition. The image includes three lines of text 410-430, showing a typical example of the horizontal and vertical arrangement of characters in an image 400. The quality of image 400 is poor, which can be seen from the fact that:

- some characters are not completely scanned, such as the character 2 on the top line of text 410 being in two separate pieces;

- some characters are connected, as can be seen for the text +3 on the top line of text 410, which is linked together by an incorrect connection line; and

- the scan includes a gray background, which can make it difficult to distinguish the connected components of the text in the image.

The image 400 shows a gray background surrounding the text, due to the poor quality of the scan. Note that performing the image binarization should remove the gray background, making it possible to distinguish the text from the background.

Figures 5A-C illustrate how the machine-coded characters are extracted from the original image. Lines 410-430 of Figures 5A-C correspond to the lines of text identified in the image 400 in Figure 4. Below each line of text is the illustrated output 510-530 of the character recognition unit 120. The output 510-530 of the character recognition comprises a plurality of machine-coded characters, each provided with an activation value. The activation value is illustrated in Figures 5A-C by an activation strip which indicates how the activation value is used to determine the machine-coded character. An activation value indicates the probability that a character has been identified; it is the probability that the identification of the machine-coded character is accurate. The activation strip provides a visual display of the activation value corresponding to the machine-coded character. Each of the machine-coded characters corresponds to the character identified as being in the image, although the position of the machine-coded character has not been determined.

For each line of text, the character recognition unit 120 scans the image from left to right. Each line is considered one column at a time, thus sweeping the line. Each column includes a vertical group of pixels that are part of the line. As each column is scanned, the information obtained is used to determine the activation value associated with the characters that are identified.

In Figures 5A-C, the activation strips have two colors, being shown in black to the left of the activation strip and in gray to the right of the activation strip. When the activation strip is black, this means that the activation value of the given character is greater than the activation value of any other character. When the activation strip is gray, this means that the activation value is better than the activation values of the other characters, but not greater than the null activation value. Therefore, the activation strips shown in Figures 5A-C demonstrate that, as each character identified in the image is read along the line from left to right, the activation value increases, ensuring that the confidence level of the machine-coded character is greater than the required threshold.

Taking for example the letter T on line 410 of Figure 5A, the output 510 comprises the machine-coded character T as well as an associated activation strip. When this activation strip is black, this indicates that the letter T has been recognized with an activation value greater than the activation of any other character. When the activation strip turns gray, this indicates that enough columns have been read to determine that the letter T has been identified with a probability that exceeds the required threshold. The output of the machine-coded character T makes it possible to estimate the position of this character in step S230. To do this, it is determined that the letter T must be positioned in the region of the image which has been used to recognize the character T.
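A minimal sketch of this estimate of step S230, assuming the recognition unit reports, for each output character, the range of scanned columns over which its activation exceeded the threshold (real recognizers may not expose this directly):

```python
# Minimal sketch of step S230: the first location of each recognized character
# is estimated as the mid-point of the column range the recognizer consumed.
# The (char, first_column, last_column) triples are assumed inputs.
def estimate_first_locations(activations):
    return [(char, (first + last) / 2.0)       # x coordinate estimate
            for char, first, last in activations]

scan_output = [("T", 10, 22), ("e", 24, 33), ("l", 35, 39)]
print(estimate_first_locations(scan_output))
# [('T', 16.0), ('e', 28.5), ('l', 37.0)]
```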
Figures 6A-C show each of the segmented lines. There are a variety of techniques for segmenting a line of text into connected components. Connected component analysis is an algorithmic application of graph theory, in which subsets of connected components are uniquely labeled. The algorithms can process one connected component at a time, so that once the first pixel of a connected component is found, all of the connected pixels of that connected component are labeled as part of this connected component, before moving on to the next connected component in the image. Alternatively, a two-pass technique can be performed, first passing over the image to assign temporary labels, and then passing over the image to perform connectivity checks.

For each of the connected components which has been identified, a bounding box is provided around the connected component. Thus, the image is divided into a plurality of regions which are defined by the bounding boxes. By way of example, the character T of Figure 6A is surrounded by a bounding box 610, which shows that a connected component has been identified. Likewise, Figure 6B shows the character F surrounded by a bounding box 620, and Figure 6C shows the character E surrounded by a bounding box 630.

Figures 6A-C show that the connected components do not always correspond exactly to the machine-coded characters. Therefore, the connected components can be merged or split where appropriate.

It is possible that a plurality of different connected components correspond to a single character. There are many characters for which a plurality of connected components can be expected. For example, Figures 6A-C indicate a character identified as having two different connected components 633-634. Likewise, Figure 6C shows the character i identified as having two different connected components 631-632. Other examples include characters that include accents or umlauts.

In addition, particularly for a low-quality image, it is possible that an error results in a plurality of different connected components corresponding to a single character. This is shown in Figure 6A, where the character 2 is identified as having two different connected components 615-616, as shown by the two bounding boxes that separate this character into two distinct regions. Likewise, Figure 6A shows the character e identified as having three different connected components 611-613.

On the other hand, it is possible that a single connected component corresponds to a plurality of characters. For example, Figure 6A shows that the text +3 is part of the same connected component 614, because the scan shows these characters connected by an incorrect line due to the low quality of the scan. However, this error did not occur in Figure 6B, because the corresponding connected components 621-622 are not interconnected. Note that there are situations in which a plurality of characters are correctly identified as being connected to form a single connected component, such as diphthongs and ligatures. Another reason why characters may be connected to form a single connected component is that the characters are underlined, which, depending on the font used, can cause the characters to be joined below the line of text.
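A minimal sketch of the one-component-at-a-time labeling described above, using a flood fill with 4-connectivity and recording one bounding box per component; a production system might prefer the two-pass technique:

```python
# Minimal sketch of connected component extraction (Figures 6A-C): a flood
# fill labels each group of touching ink pixels and records its bounding box.
def connected_components(binary):
    h, w = len(binary), len(binary[0])
    labels = [[None] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 1 and labels[y][x] is None:
                stack, box = [(y, x)], [x, y, x, y]   # x0, y0, x1, y1
                labels[y][x] = len(boxes)
                while stack:
                    cy, cx = stack.pop()
                    box[0], box[1] = min(box[0], cx), min(box[1], cy)
                    box[2], box[3] = max(box[2], cx), max(box[3], cy)
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and binary[ny][nx] == 1 and labels[ny][nx] is None:
                            labels[ny][nx] = len(boxes)
                            stack.append((ny, nx))
                boxes.append(tuple(box))
    return boxes

glyph_i = [[0, 1, 0],
           [0, 0, 0],
           [0, 1, 0],
           [0, 1, 0]]
print(connected_components(glyph_i))   # [(1, 0, 1, 0), (1, 2, 1, 3)]: two parts, like 'i'
```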
The term blob can be used to designate a recognized character associated with one or more connected components. An assessment is made to determine whether a blob should be split, so that the connected component can be associated with a plurality of recognized characters.

The output of the blob can be used to determine, in step S240, a coordinate which is to be used for the position of the corresponding machine-coded character. The coordinate is placed near the connected component, for example in the upper left corner of the bounding box. In addition, a plurality of coordinates can be placed near the connected component, for example around the perimeter of the bounding box.

Figure 7 illustrates the result 700 which is output after the character segmentation algorithm has been performed on the image. The machine-coded text is positioned based on the position of the connected components that have been identified. The machine-coded characters are displayed on a grid in positions based on the result of the execution of the image processing method S200 presented in Figure 2. The output 700 of Figure 7 shows a number of crosses which represent the coordinates used to place the machine-coded characters in the desired position.

Taking for example the letter T on the top line of Figure 7, this character is surrounded by a plurality of position markers, which are used to determine the position of this character in step S260. The machine-coded letter T was placed using the coordinates associated with the connected component of the character T identified in the image. The bounding box of the letter T includes a plurality of position markers around its perimeter. It would be possible to determine the position of the machine-coded character using a single coordinate which corresponds to the connected component, although the use of a plurality of coordinates improves the accuracy of the position of the letter T and ensures that it is placed with the correct orientation. Knowing the position of each pixel in the connected components of the blob would provide the most accurate possible description of the position of the blob, although for convenience and to improve the execution speed of the software, this is simplified by using a bounding box or a barycenter.

Consequently, the machine-coded characters shown in the image 400 illustrated in Figures 4-6 form three lines of text 410-430 which read:

Phone: +32 (0)
Fax: +32
E-mail :

The positioning contains some errors, so not all of the machine-coded characters are in the correct position.

Figure 8 provides a table which includes examples showing how the connected components are associated with the recognized characters.

Line 810 of the table shows a single connected component and a single recognized character forming a blob. The connected component 610 is associated with the machine-coded character T, which together form the blob. This blob is created by step S253A, in the case where step S252 does not provide an association with an unassigned connected component.

Line 820 of the table shows a plurality of connected components and a single recognized character forming a blob. The connected components 633-634 are associated with the machine-coded character, together forming the blob. This blob is created by step S253A, for the situation in which step S252 provides an association with the unassigned connected component 633. This example provides a machine-coded character which is expected to be composed of a plurality of connected components.

Line 830 of the table shows a plurality of connected components and a single recognized character forming a blob. The connected components 615-616 are associated with the machine-coded character 2, which together form the blob. This blob is created by step S253A, for the situation in which step S252 provides an association with the unassigned connected component 616. This example provides a machine-coded character which is expected to be composed of a single connected component, although a plurality of connected components are actually detected due to the poor image quality.

Line 840 of the table shows a single connected component and a plurality of recognized characters forming a merged blob. The connected component 614 is associated with the machine-coded characters +3, which together form the blob. This blob is created by step S253B, for the situation in which the connected component 614 is determined to be associated with a plurality of recognized characters.

Line 850 of the table shows a single connected component and a plurality of recognized characters forming a plurality of divided blobs. The connected component 614 is associated with the machine-coded characters + and 3, which together form the blobs. These blobs are created by step S253C, for the situation in which the connected component 614 is determined to be associated with a plurality of recognized characters.
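As an illustration only, several of the cases of Figure 8 can be written out as simple records, using the component numbers named above; the dict layout is an assumption of the example (the character of line 820 is not reproduced here because it is not legible in the source):

```python
# Illustrative blob records mirroring rows of Figure 8.
blob_810 = {"characters": ["T"], "components": [610]}        # one-to-one
blob_830 = {"characters": ["2"], "components": [615, 616]}   # glyph split by a scan defect
merged_840 = {"characters": ["+", "3"], "components": [614]} # merged blob (ligature-like)
divided_850 = [                                              # divided blobs after step S256C
    {"characters": ["+"], "components": [(614, "left part")]},
    {"characters": ["3"], "components": [(614, "right part")]},
]
```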
The above examples can also be carried out by a computer of a system or of a device (or by devices such as a CPU or MPU) which reads and executes a program recorded on a memory device to perform the functions of the examples described above, and by a method whose steps are performed by a computer of a system or device, for example, by reading and executing a program recorded on a memory device to perform the functions of the examples described above. For this purpose, the program is supplied to the computer, for example, via a network or from a recording medium of various types serving as a memory device (for example, a computer-readable medium such as a non-transitory computer-readable medium).

Although the present invention has been described with reference to embodiments, it should be understood that the invention is not limited to the disclosed embodiments. The present invention can be implemented in various forms without departing from the principal features of the present invention. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.
Claims:
Claims

1. Image processing system for locating recognized characters in an image, comprising:
an estimation means configured to estimate a first location of a recognized character which has been obtained by performing character recognition of the image;
a determination means configured to determine second locations of a plurality of connected components in the image;
a comparison means configured to compare the first location and the second locations, in order to identify a connected component associated with the recognized character; and
an association means configured to associate the recognized character, the identified connected component and the second location of the identified connected component.

2. An image processing system according to claim 1, in which the estimation means, the determination means, the comparison means and the association means are configured to perform their functions for each recognized character of a plurality of recognized characters.

3. An image processing system according to claim 1 or claim 2, further comprising:
a memory configured to store, in association with the image, the recognized character and the second location of the identified connected component.

4. An image processing system according to any one of the preceding claims, further comprising:
at least one character recognition means configured to perform character recognition of the image.

5. An image processing system according to claim 4, in which said at least one character recognition means is configured to perform character recognition using a neural network which has been trained to recognize a plurality of character strings.

6. An image processing system according to any one of the preceding claims, further comprising:
an augmentation means configured to augment the image to include the recognized character at the second location.

7. An image processing system according to any one of the preceding claims, in which the comparison means is configured to identify the connected component having a second location which is closest to the first location of the recognized character.

8. An image processing system according to claim 7, in which:
the comparison means is further configured to identify another connected component which is not associated with any recognized character and to identify the recognized character closest to the other connected component; and
the association means is further configured to associate the closest recognized character, the connected component identified with the closest recognized character, the other connected component and the second location of the identified connected component.

9. An image processing system according to any one of the preceding claims, in which, if the connected component corresponds to a plurality of recognized characters, the association means is configured to associate the plurality of recognized characters and the connected component.

10. An image processing system according to any one of claims 1 to 8, in which, if the connected component corresponds to a plurality of recognized characters:
a division means is configured to divide the connected component into a plurality of parts;
the comparison means is configured to identify a location for each of the parts; and
the association means is configured, for each of the parts, to associate the recognized character, the part of the connected component and the location of the part.
11. An image processing system according to any one of the preceding claims, in which the second location is a coordinate of the connected component.

12. An image processing system according to any one of claims 1 to 10, in which the second location is a bounding box that contains the connected component.

13. An image processing system according to any one of the preceding claims, further comprising:
a selection means configured to allow a user to select the recognized character in the image, at the second location of the corresponding connected component in the image.

14. An image processing system according to claim 13, further comprising:
a display means configured to display the image; and
a display controller configured to allow the user to select the recognized character by user selection at the second locations of the corresponding connected components in the image.

15. An image processing system according to claim 13 or claim 14, in which the user selection is made using a mouse, a touch screen or a keyboard.

16. Image processing method for locating recognized characters in an image, comprising the steps of:
estimating a first location of a recognized character which has been obtained by performing character recognition of the image;
determining second locations of a plurality of connected components in the image;
comparing the first location and the second locations, in order to identify a connected component associated with the recognized character; and
associating the recognized character, the identified connected component and the second location of the identified connected component.

17. A program which, when executed by an image processing system, causes the image processing system to perform an image processing method according to claim 16.

18. A computer-readable medium storing a program according to claim 17.